Methods for concurrent identification and quantification of an unknown bioagent

ABSTRACT

The present invention provides methods for the quantification of an unknown bioagent in a sample by amplification of nucleic acid of the bioagent, and concurrent amplification of a known quantity of a calibration polynucleotide from which are obtained a bioagent identifying amplicon and a calibration amplicon. Upon molecular mass analysis, mass and abundance data are obtained. The identity of the bioagent is then determined from the molecular mass of the bioagent identifying amplicon and the quantity of the identified bioagent in the sample is determined from the abundance data of the bioagent identifying amplicon and the abundance data of the calibration amplicon.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of priority to U.S. Non-Provisional application Ser. No. 11/059,776 filed Feb. 17, 2005, U.S. Provisional Application Ser. No. 60/545,425 filed Feb. 18, 2004, and U.S. Provisional Application Ser. No. 60/559,754 filed Apr. 5, 2004, each of which is incorporated herein by reference in its entirety.

STATEMENT OF GOVERNMENT SUPPORT

This invention was made with United States Government support under contract MDA972-00-C-0053 awarded by DARPA/SAIC. The United States Government has certain rights in the invention.

FIELD OF THE INVENTION

The present invention is related generally to nucleic acid amplification technology and microbiology.

BACKGROUND OF THE INVENTION

Information about the identity and total amount of microbes in biological samples is of prime importance in medicine in order to assess the risk of infectious disease, to diagnose infections and predict their clinical course. In a variety of other areas such as food product monitoring, bioremediation, microbial forensics and biowarfare/bioterror investigations, efficient and cost effective methods for quantification of microbial bioagents are needed. In addition, determination of the quantity of a bioagent (microbe, bacterium, virus, fungus, etc.) is a common endeavor in microbiology in the fields of clinical diagnostics, epidemiology, forensics, bioremediation, and quality control.

Methods currently in use for detection and determination of bacteria include bacterial culture and microscopy, detection of bacterial metabolites, and identification of surface molecules by specific antibodies.

The polymerase chain reaction (PCR) is, in its basic form, only a qualitative method due to its exponential time course and equally exponential amplification of errors. Efforts have been made to convert PCR to a quantitative method. Quantitative PCR methods include those that depend upon external standardization and those that depend upon internal standardization. Among the latter, competitive PCR methods are based on co-amplification of a target DNA with a standard competitor DNA which competes with the template DNA for the same set of amplification primers. Since the competitor is added to the PCR reaction mixture in known amounts, it is possible to calculate the amount of target DNA from the experimental determination of the ratio of amplified products of sample and standard competitor DNA.

Methods for rapid and cost effective identification of microbial bioagents through molecular mass analysis of bioagent identifying amplicons are disclosed and claimed in U.S. application Ser. Nos. 09/798,007, 09/891,793, 10/660,997, 10/660,122, 10/660,996, 10/418,514 and 10/728,486, each of which is commonly owned and incorporated herein by reference in its entirety. These methods and others would derive great benefit from a means of determination of the quantity of any given microbial bioagent present in a biological sample. Quantification of organisms can be very valuable, particularly in a clinical setting such as Hepatitis C infection, where a greater number of infectious organisms generally correlates with a less healthy patient and a more difficult clinical course.

The methods described herein satisfy the need for methods for concurrent identification and quantification of bioagents, as well as other needs, by providing internal calibration using a nucleic acid standard calibrant in an amplification reaction.

SUMMARY OF THE INVENTION

The present invention provides methods for determination of the quantity of an unknown bioagent in a sample by contacting the sample with a pair of primers and a known quantity of a calibration polynucleotide that comprises a calibration sequence. Nucleic acid from the bioagent and nucleic acid from the calibration polynucleotide are concurrently amplified in the sample with the pair of primers to obtain a first amplification product comprising a bioagent identifying amplicon and a second amplification product comprising a calibration amplicon. The sample is then subjected to molecular mass analysis resulting in molecular mass and abundance data for the bioagent identifying amplicon and the calibration amplicon. The bioagent identifying amplicon is distinguished from the calibration amplicon based on molecular mass wherein the molecular mass of the bioagent identifying amplicon provides a means for identifying the bioagent. Comparison of bioagent identifying amplicon abundance data and calibration amplicon abundance data indicates the quantity of bioagent in the sample.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a representative process diagram for identification and determination of the quantity of a bioagent in a sample.

FIG. 2 shows a representative mass spectrum of a viral bioagent identifying amplicon for the RdRp primer set of the SARS coronavirus (SARS) and the corresponding RdRp calibration amplicon.

FIG. 3 shows a representative mass spectrum of an amplified nucleic acid mixture containing the Ames strain of Bacillus anthracis, a known quantity of combination calibration polynucleotide vector which includes the CapC calibration sequence for Bacillus anthracis and primer pair 350 (see Example 4).

The figures depict preferred embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.

DESCRIPTION OF EMBODIMENTS

The present invention provides methods for identification and determination of the quantity of a bioagent in a sample. Referring to FIG. 1, to a sample containing nucleic acid of an unknown bioagent are added primers (100) and a known quantity of a calibration polynucleotide (105). The total nucleic acid in the sample is then subjected to an amplification reaction (110) to obtain amplification products. The molecular masses of amplification products are determined (115) from which are obtained molecular mass and abundance data. The molecular mass of the bioagent identifying amplicon (120) provides the means for its identification (125) and the molecular mass of the calibration amplicon obtained from the calibration polynucleotide (130) provides the means for its identification (135). The abundance data of the bioagent identifying amplicon is recorded (140) and the abundance data for the calibration amplicon is recorded (145), both of which are used in a calculation (150) which determines the quantity of unknown bioagent in the sample. Each of these features is described below in greater detail.

In one embodiment, a sample comprising an unknown bioagent is contacted with a pair of primers which can amplify nucleic acid from the bioagent, and a known quantity of a polynucleotide that comprises a calibration sequence. The nucleic acids of the bioagent and of the calibration sequence are amplified. The rate of amplification is reasonably assumed to be similar for the nucleic acid of the bioagent and of the calibration sequence. The amplification reaction produces two amplification products: a bioagent identifying amplicon and a calibration amplicon. The amplified sample containing the bioagent identifying amplicon and the calibration amplicon is then subjected to molecular mass analysis by mass spectrometry, for example. The resulting molecular mass analysis of the nucleic acid of the bioagent and of the calibration sequence provides molecular mass data and abundance data for the nucleic acid of the bioagent and of the calibration sequence. The molecular mass data obtained for the nucleic acid of the bioagent enables identification of the unknown bioagent and the abundance data enables calculation of the quantity of the bioagent, based on the knowledge of the quantity of calibration polynucleotide contacted with the sample. The calculations are well within the scope of those of the ordinary artisan.
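By way of illustration only, the calculation referred to above can be sketched as a simple proportion, under the stated assumption that the nucleic acid of the bioagent and the calibration sequence amplify at similar rates. The function name, parameter names and numeric values below are hypothetical and are not taken from the disclosure.

```python
def estimate_bioagent_quantity(calibrant_copies, bioagent_abundance, calibrant_abundance):
    """Estimate bioagent genome copies from the ratio of amplicon abundances.

    Assumes (as the text does) that both templates amplify at similar rates,
    so the abundance ratio of the amplicons reflects the ratio of the
    starting templates.
    """
    if calibrant_abundance <= 0:
        # Without a measurable calibration amplicon the run cannot be calibrated.
        raise ValueError("calibration amplicon abundance must be positive")
    return calibrant_copies * (bioagent_abundance / calibrant_abundance)


# Hypothetical values: 1000 calibrant copies spiked in; the bioagent
# identifying amplicon peak is twice as abundant as the calibrant peak.
estimate = estimate_bioagent_quantity(
    calibrant_copies=1000,
    bioagent_abundance=3.0e5,
    calibrant_abundance=1.5e5,
)
print(estimate)  # 2000.0
```

The abundance values may be peak heights or peak areas from the mass spectrum; which is used is left open here.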

A calibration sequence is a sequence chosen to represent a portion of a genome of a bioagent (bacterium, virus etc.) that can be amplified by a particular primer pair to yield an amplification product (calibration amplicon) that can be distinguished on the basis of its molecular mass from an analogous amplification product (bioagent identifying amplicon) obtained by amplification of native DNA of a bioagent (bacterium, virus, etc) with the same pair of primers. One means of distinguishing an amplification product of a calibration sequence vs. a bioagent identifying amplicon is to design the calibration sequence so that, upon amplification, it gives rise to an amplification product consisting of a calibration amplicon that has a molecular mass distinguishable from the analogous bioagent identifying amplicon. This is desired because, as in any internally calibrated method, the calibration sequence and the bioagent sequence are amplified concurrently in the same amplification reaction vessel.

In some embodiments, construction of a standard curve, in which the amount of calibration polynucleotide spiked into the sample is varied, provides additional resolution and improved confidence for the determination of the quantity of bioagent in the sample. The use of standard curves for analytical determination of molecular quantities is well known to one with ordinary skill and can be performed without undue experimentation.
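One possible way to use such a standard curve, offered as a sketch only, is the equivalence-point approach common in competitive PCR: the logarithm of the bioagent-to-calibrant abundance ratio is fitted against the logarithm of the spiked calibrant amount, and the bioagent quantity is read off where the fitted ratio equals one. The disclosure does not prescribe a fitting procedure; the function names and data below are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def equivalence_point(calibrant_copies, abundance_ratios):
    """Calibrant amount at which the fitted bioagent/calibrant ratio is 1.

    At that point the starting quantities of bioagent template and
    calibrant are taken to be equal.
    """
    xs = [math.log10(c) for c in calibrant_copies]
    ys = [math.log10(r) for r in abundance_ratios]
    slope, intercept = fit_line(xs, ys)
    # log10(ratio) = 0  =>  x = -intercept / slope
    return 10 ** (-intercept / slope)


# Hypothetical titration: same sample, three calibrant spike levels.
spiked = [100, 1000, 10000]
ratios = [5.0, 0.5, 0.05]  # bioagent abundance / calibrant abundance
estimate = equivalence_point(spiked, ratios)
print(round(estimate))  # 500
```

Fitting several spike levels, rather than relying on a single ratio, averages out measurement noise in the individual abundance values.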

In some embodiments, multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with multiple intelligent primer pairs which also amplify the corresponding standard calibration sequences. In this or other embodiments, the standard calibration sequences are optionally included within a single vector such as a plasmid which functions as the calibration polynucleotide. Multiplex amplification methods are well known to those with ordinary skill and can be performed without undue experimentation.

In some embodiments, the calibrant polynucleotide is used as an internal positive control to confirm that amplification conditions and subsequent analysis steps are successful in producing a measurable amplicon. Even in the absence of copies of the genome of a bioagent, the calibration polynucleotide can give rise to a calibration amplicon. Failure to produce a measurable calibration amplicon indicates a failure of amplification or subsequent analysis step such as amplicon purification or molecular mass determination.
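The internal positive control logic described above may be sketched as follows. This is an illustration under stated assumptions: the detection threshold, the function name and the result labels are hypothetical, and a practical implementation would depend on the instrument and its noise characteristics.

```python
def interpret_run(calibrant_abundance, bioagent_abundance, detection_threshold):
    """Classify a run using the calibration amplicon as a positive control."""
    if calibrant_abundance < detection_threshold:
        # No measurable calibration amplicon: amplification, purification
        # or mass determination failed, so nothing can be concluded.
        return "invalid: amplification or analysis failure"
    if bioagent_abundance < detection_threshold:
        # The control worked, yet no bioagent identifying amplicon was seen.
        return "valid: no bioagent detected"
    return "valid: bioagent detected"


# Hypothetical abundances in arbitrary instrument units.
print(interpret_run(0.0, 0.0, detection_threshold=100.0))
print(interpret_run(5000.0, 0.0, detection_threshold=100.0))
print(interpret_run(5000.0, 12000.0, detection_threshold=100.0))
```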

In some embodiments, the calibration sequence is inserted into a vector which then itself functions as the calibration polynucleotide. In some embodiments, more than one calibration sequence is inserted into the vector that functions as the calibration polynucleotide. The process of inserting polynucleotides into vectors is routine to those skilled in the art and can be accomplished without undue experimentation. Thus, it should be recognized that the present invention should not be limited to the embodiments described herein. The present invention can be applied for determination of the quantity of any bioagent identifying amplicon when an appropriate standard calibrant polynucleotide sequence is designed and used. The process of choosing an appropriate vector such as a plasmid for insertion of a calibrant is also a routine operation that can be accomplished by one with ordinary skill without undue experimentation.

In some embodiments of the present invention, determination of the molecular masses of the bioagent identifying amplicon and the calibration amplicon is accomplished using mass spectrometry. Exemplary techniques of mass spectrometry include, but are not limited to, electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry (ESI-FTICR-MS) and electrospray ionization time-of-flight mass spectrometry (ESI-TOF-MS).

In some embodiments, bioagent identifying amplicons and calibration amplicons are of a length between about 45-200 base pairs. One will recognize that these embodiments comprise bioagent identifying amplicons and calibration amplicons of lengths of about 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, or 200 base pairs, or any range therewithin.

In other embodiments, bioagent identifying amplicons and calibration amplicons are of a length between about 45-140 base pairs. One will recognize that these embodiments comprise bioagent identifying amplicons and calibration amplicons of lengths of about 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, or 140 base pairs, or any range therewithin.

In some embodiments, the primers used to obtain bioagent identifying amplicons and calibration amplicons upon amplification hybridize to conserved regions of nucleic acid of genes encoding proteins or RNAs necessary for life which include, but are not limited to: 16S and 23S rRNAs, RNA polymerase subunits, t-RNA synthetases, elongation factors, ribosomal proteins, protein chain initiation factors, cell division proteins, chaperonin groEL, chaperonin dnaK, phosphoglycerate kinase, NADH dehydrogenase, DNA ligases, and DNA topoisomerases.

Calibration sequences can be routinely designed without undue experimentation by choosing a reference sequence representing any bioagent identifying amplicon that can be amplified by a specific pair of primers of any class (e.g., broad range survey, division-wide, clade level, or drill down) or any arbitrarily named class of primer, and by deleting about 2-8 consecutive nucleobases from, or inserting about 2-8 consecutive nucleobases into, that sequence such that the calibration sequence is distinguishable by molecular mass from the reference sequence upon which the calibration sequence is based. One will recognize that this range comprises insertions or deletions of 2, 3, 4, 5, 6, 7, or 8 nucleobases. In other embodiments, the total insertion or deletion of consecutive nucleobases may also exceed 8 nucleobases. In other embodiments, the total insertion or deletion of consecutive nucleobases results in a calibration sequence having at least 80%, at least 85%, at least 90%, or at least 95% sequence identity with a chosen standard sequence of a bioagent identifying amplicon.
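As a minimal sketch of the design step described above, the following derives a calibration sequence from a reference sequence by deleting a block of consecutive nucleobases and estimates the resulting mass shift for one strand. The reference sequence, the deletion position and the function names are hypothetical; the residue masses are approximate average masses of DNA nucleotide residues, and end-group corrections are omitted because they cancel in the difference.

```python
# Approximate average masses (Da) of DNA nucleotide residues within a strand.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def delete_block(reference, start, length):
    """Return the reference with `length` consecutive nucleobases removed."""
    if not 2 <= length <= 8:
        raise ValueError("this sketch covers deletions of 2 to 8 nucleobases")
    if start < 0 or start + length > len(reference):
        raise ValueError("deletion must lie within the reference sequence")
    return reference[:start] + reference[start + length:]


def residue_mass_sum(sequence):
    """Sum of residue masses for one strand (end groups not included)."""
    return sum(RESIDUE_MASS[base] for base in sequence)


# Hypothetical reference amplicon; in practice the deleted block must lie
# between the primer binding sites so that the same primers still amplify it.
reference = "ACGTTGCAAGGCTTACGATCGGATCCATGCAAGTTCGATCGGCTAAGCTTGCA"
calibrant = delete_block(reference, start=20, length=5)

mass_shift = residue_mass_sum(reference) - residue_mass_sum(calibrant)
print(len(reference) - len(calibrant), "nucleobases deleted")
print(f"single-strand mass shift of about {mass_shift:.2f} Da")
```

A shift of this size, on the order of 1.5 kDa for a five-base deletion, is large compared with the mass difference between individual nucleobases, which is why the two amplicons are readily separated in the spectrum.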

In some embodiments, the primers used for amplification of bioagent identifying amplicons and calibration amplicons hybridize to and amplify genomic DNA, DNA of bacterial plasmids or DNA of DNA viruses.

In some embodiments, the primers used for amplification of bioagent identifying amplicons and corresponding calibration amplicons hybridize directly to ribosomal RNA or messenger RNA (mRNA) and act as reverse transcription primers for obtaining DNA from direct amplification of bacterial rRNA. Methods of amplifying RNA using reverse transcriptase are well known to those with ordinary skill in the art and can be routinely established without undue experimentation.

Synthesis of primers is well known and routine in the art. The primers may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, Calif.). Any other means for such synthesis known in the art may additionally or alternatively be employed.

The primers can be employed as compositions for use in methods for identification of bacterial bioagents as follows: a primer pair composition is contacted with nucleic acid of an unknown bacterial bioagent. The nucleic acid is then amplified by a nucleic acid amplification technique, such as PCR for example, to obtain an amplification product that represents a bioagent identifying amplicon. The molecular mass of a single strand or each strand of the double-stranded amplification product is determined by a molecular mass measurement technique such as mass spectrometry for example, wherein the two strands of the double-stranded amplification product are separated during the ionization process. In some embodiments, the mass spectrometry is electrospray Fourier transform ion cyclotron resonance mass spectrometry (ESI-FTICR-MS) or electrospray time of flight mass spectrometry (ESI-TOF-MS). A list of possible base compositions can be generated for the molecular mass value obtained for each strand and the choice of the correct base composition from the list is facilitated by matching the base composition of one strand with a complementary base composition of the other strand. The molecular mass or base composition thus determined is then compared with a database of molecular masses or base compositions of analogous bioagent identifying amplicons for known bioagents. A match between the molecular mass or base composition of the amplification product and the molecular mass or base composition of an analogous bioagent identifying amplicon for a known bioagent indicates the identity of the unknown bioagent. In some embodiments, the method is repeated using a different primer pair to resolve possible ambiguities in the identification process or to improve the confidence level for the identification assignment.
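The base composition step described above may be illustrated by the following sketch, which lists every composition whose mass falls within a tolerance of the measured mass of each strand and then retains only those forward-strand candidates whose complement appears among the reverse-strand candidates. The residue masses are approximate average values, end-group corrections are omitted, and the strand composition, tolerance and function names are hypothetical; an actual implementation would use accurate masses appropriate to the instrument.

```python
# Approximate average residue masses (Da); end groups are ignored here.
M_A, M_C, M_G, M_T = 313.21, 289.18, 329.21, 304.20


def composition_mass(a, c, g, t):
    return a * M_A + c * M_C + g * M_G + t * M_T


def candidate_compositions(measured_mass, tolerance):
    """All (A, C, G, T) counts whose mass is within tolerance of the measurement.

    Assumes the tolerance is much smaller than one residue mass, so at most
    one T count can fit for any given choice of A, C and G.
    """
    hits = []
    for a in range(int((measured_mass + tolerance) // M_A) + 1):
        rem_a = measured_mass - a * M_A
        for c in range(int((rem_a + tolerance) // M_C) + 1):
            rem_c = rem_a - c * M_C
            for g in range(int((rem_c + tolerance) // M_G) + 1):
                rem_g = rem_c - g * M_G
                t = round(rem_g / M_T)
                if t >= 0 and abs(rem_g - t * M_T) <= tolerance:
                    hits.append((a, c, g, t))
    return hits


def complementary_pairs(forward_hits, reverse_hits):
    """Keep forward candidates whose complement is a reverse candidate."""
    reverse_set = set(reverse_hits)
    pairs = []
    for a, c, g, t in forward_hits:
        complement = (t, g, c, a)  # A pairs with T, C pairs with G
        if complement in reverse_set:
            pairs.append(((a, c, g, t), complement))
    return pairs


# Hypothetical 20-mer strands simulated from a known composition.
forward_mass = composition_mass(5, 4, 6, 5)
reverse_mass = composition_mass(5, 6, 4, 5)
forward_hits = candidate_compositions(forward_mass, tolerance=0.5)
reverse_hits = candidate_compositions(reverse_mass, tolerance=0.5)
pairs = complementary_pairs(forward_hits, reverse_hits)
print(len(forward_hits), len(reverse_hits), len(pairs))
```

The surviving pairs would then be compared with the database of base compositions of known bioagents, as described in the passage above.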

In some embodiments, a bioagent identifying amplicon or a calibration amplicon may be produced using only a single primer composition (either the forward or reverse primer of any given primer pair), provided an appropriate amplification method is chosen, such as, for example, low stringency single primer PCR (LSSP-PCR).

In some embodiments, the oligonucleotide primers are “broad range survey primers” which hybridize to conserved regions of nucleic acid encoding ribosomal RNA (rRNA) of at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 99%, or all known bacteria and produce bacterial bioagent identifying amplicons. As used herein, the term “broad range survey primers” refers to primers that bind to nucleic acid encoding rRNAs of at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 99%, or all known species of bacteria. In some embodiments, the rRNAs to which the primers hybridize are 16S and 23S rRNAs.

In some cases, the molecular mass or base composition of a bacterial bioagent identifying amplicon defined by a broad range survey primer pair does not provide enough resolution to unambiguously identify a bacterial bioagent at the species level. These cases benefit from further analysis of one or more bacterial bioagent identifying amplicons generated from at least one additional broad range survey primer pair or from at least one additional “division-wide” primer pair (vide infra). The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as “triangulation identification” (vide infra).

In other embodiments, the oligonucleotide primers are “division-wide” primers which hybridize to nucleic acid encoding genes of broad divisions of bacteria such as members of the Bacillus/Clostridia group or members of the α-, β-, γ-, and ε-proteobacteria. In some embodiments, a division of bacteria comprises any grouping of bacterial genera with more than one genus represented. For example, the β-proteobacteria group comprises members of the following genera: Eikenella, Neisseria, Achromobacter, Bordetella, Burkholderia, and Ralstonia. Species members of these genera can be identified using bacterial bioagent identifying amplicons generated with a primer pair which produces a bacterial bioagent identifying amplicon from the tufB gene of β-proteobacteria. Examples of genes to which division-wide primers may hybridize include, but are not limited to: RNA polymerase subunits such as rpoB and rpoC, tRNA synthetases such as valyl-tRNA synthetase (valS) and aspartyl-tRNA synthetase (aspS), elongation factors such as elongation factor EF-Tu (tufB), ribosomal proteins such as ribosomal protein L2 (rplB), protein chain initiation factors such as protein chain initiation factor infB, chaperonins such as groL and dnaK, and cell division proteins such as peptidase ftsH (hflB).

In other embodiments, the oligonucleotide primers are designed to enable the identification of bacteria at the clade group level, which is a monophyletic taxon referring to a group of organisms which includes the most recent common ancestor of at least 70%, at least 80%, at least 90%, or all of its members and at least 70%, at least 80%, at least 90%, or all of the descendants of that most recent common ancestor. The Bacillus cereus clade is an example of a bacterial clade group.

In other embodiments, the oligonucleotide primers are “drill-down” primers which enable the identification of “sub-species characteristics.” These primers can hybridize to conserved regions of nucleic acid of genes encoding structural proteins or proteins implicated in, for example, pathogenicity. Examples of genes indicating sub-species characteristics include, but are not limited to: toxin genes, pathogenicity markers, antibiotic resistance genes and virulence factors. Drill down primers provide the functionality of producing bacterial bioagent identifying amplicons for drill-down analyses such as strain typing when contacted with bacterial nucleic acid under amplification conditions. Identification of such sub-species characteristics is often critical for determining proper clinical treatment of bacterial infections.

It is, thus, readily apparent that one with ordinary skill can design calibration sequences that can be amplified by any of the primer classes disclosed herein in order to produce appropriate calibration amplicons.

One with ordinary skill in the art of design of amplification primers will recognize that a given primer need not hybridize with 100% complementarity in order to effectively prime the synthesis of a complementary nucleic acid strand in an amplification reaction. Moreover, a primer may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a loop structure or a hairpin structure). The primers of the present invention may comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99% sequence identity with any of the primers listed in Table 1. Thus, in some embodiments of the present invention, an extent of variation of 70% to 100% of the sequence identity is possible relative to the specific primer sequences disclosed herein. Determination of sequence identity is described in the following example: a primer 20 nucleobases in length which is otherwise identical to another 20 nucleobase primer but has two non-identical residues has 18 of 20 identical residues (18/20=0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of a primer 20 nucleobases in length would have 15/20=0.75 or 75% sequence identity with the 20 nucleobase primer.
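The two worked examples above can be reproduced with a short ungapped comparison, sketched below. The function name and the primer sequences are hypothetical, and the sketch counts identical residues at the best ungapped offset relative to the length of the longer primer; it is not a substitute for the alignment programs discussed below.

```python
def percent_identity(query, reference):
    """Identical residues at the best ungapped offset, as a percentage of
    the reference length (the query must not be longer than the reference)."""
    if len(query) > len(reference):
        raise ValueError("query must not be longer than the reference")
    best = 0
    for offset in range(len(reference) - len(query) + 1):
        window = reference[offset:offset + len(query)]
        matches = sum(q == r for q, r in zip(query, window))
        best = max(best, matches)
    return 100.0 * best / len(reference)


# Hypothetical 20-nucleobase reference primer.
reference = "ACGTACGTACGTACGTACGT"

# Same length, two substituted residues (positions 3 and 10, zero-based).
variant = reference[:3] + "A" + reference[4:10] + "C" + reference[11:]
print(percent_identity(variant, reference))  # 90.0

# A 15-nucleobase primer identical to a segment of the 20-nucleobase primer.
segment = reference[2:17]
print(percent_identity(segment, reference))  # 75.0
```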

Percent homology, sequence identity or complementarity, can be determined by, for example, the Gap program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research Park, Madison Wis.), using default settings, which uses the algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489). In some embodiments, complementarity of primers with respect to the conserved priming regions of bacterial nucleic acid, is between about 70% and about 80%. In other embodiments, homology, sequence identity or complementarity, is between about 80% and about 90%. In yet other embodiments, homology, sequence identity or complementarity, is about 90%, about 92%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99% or about 100%.

One with ordinary skill is able to calculate percent sequence identity or percent sequence homology and able to determine, without undue experimentation, the effects of variation of primer sequence identity on the function of the primer in its role in priming synthesis of a complementary strand of nucleic acid for production of an amplification product of a corresponding bioagent identifying amplicon.

In some embodiments of the present invention, the oligonucleotide primers are between 13 and 35 nucleobases in length (13 to 35 linked nucleotide residues). These embodiments comprise oligonucleotide primers 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 nucleobases in length, or any range therewithin.

In some embodiments, any given primer comprises a modification comprising the addition of a non-templated T residue to the 5′ end of the primer i.e: the added T residue does not necessarily hybridize to the nucleic acid being amplified. The addition of a non-templated T residue has the effect of minimizing the addition of non-template A residues as a result of the non-specific enzyme activity of Taq polymerase (Magnuson et al., Biotechniques, 1996, 21, 700-709), an occurrence which may lead to ambiguous results arising from molecular mass analysis.

In some embodiments of the present invention, primers may contain one or more universal bases. Because any variation (due to codon wobble in the 3rd position) in the conserved regions among species is likely to occur in the third position of a DNA triplet, oligonucleotide primers can be designed such that the nucleotide corresponding to this position is a base which can bind to more than one nucleotide, referred to herein as a “universal nucleobase.” For example, under this “wobble” pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or C, and uridine (U) binds to U or C. Other examples of universal nucleobases include nitroindoles such as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995, 14, 1001-1003), the degenerate nucleotides dP or dK (Hill et al.), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the purine analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids Res., 1996, 24, 3302-3306).

In some embodiments, to compensate for the somewhat weaker binding by the “wobble” base, the oligonucleotide primers are designed such that the first and second positions of each triplet are occupied by nucleotide analogs which bind with greater affinity than the unmodified nucleotide. Examples of these analogs include, but are not limited to, 2,6-diaminopurine which binds to thymine, 5-propynyluracil which binds to adenine and 5-propynylcytosine and phenoxazines, including G-clamp, which binds to G. Propynylated pyrimidines are described in U.S. Pat. Nos. 5,645,985, 5,830,653 and 5,484,908, each of which is commonly owned and incorporated herein by reference in its entirety. Propynylated primers are described in U.S. Ser. No. 10/294,203 which is also commonly owned and incorporated herein by reference in entirety. Phenoxazines are described in U.S. Pat. Nos. 5,502,177, 5,763,588, and 6,005,096, each of which is incorporated herein by reference in its entirety. G-clamps are described in U.S. Pat. Nos. 6,007,992 and 6,028,183, each of which is incorporated herein by reference in its entirety.

In some embodiments, non-template primer tags are used to increase the melting temperature (Tm) of a primer-template duplex in order to improve amplification efficiency. A non-template tag is designed to hybridize to at least three consecutive A or T nucleotide residues on a primer which are complementary to the template. In any given non-template tag, A can be replaced by C or G and T can also be replaced by C or G. The extra hydrogen bond in a G-C pair relative to an A-T pair confers increased stability of the primer-template duplex and improves amplification efficiency.

In other embodiments, propynylated tags may be used in a manner similar to that of the non-template tag, wherein two or more 5-propynylcytidine or 5-propynyluridine residues replace template matching residues on a primer. In other embodiments, a primer contains a modified internucleoside linkage such as a phosphorothioate linkage, for example.

In some embodiments, the primers contain mass-modifying tags. Reducing the total number of possible base compositions of a nucleic acid of specific molecular weight provides a means of avoiding a persistent source of ambiguity in determination of base composition of amplification products. Addition of mass-modifying tags to certain nucleobases of a given primer will result in simplification of de novo determination of base composition of a given bioagent identifying amplicon (vide infra) from its molecular mass.

In some embodiments of the present invention, the mass-modified nucleobase comprises one of the following: 7-deaza-2′-deoxyadenosine-5′-triphosphate, 5-iodo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxycytidine-5′-triphosphate, 5-iodo-2′-deoxycytidine-5′-triphosphate, 5-hydroxy-2′-deoxyuridine-5′-triphosphate, 4-thiothymidine-5′-triphosphate, 5-aza-2′-deoxyuridine-5′-triphosphate, 5-fluoro-2′-deoxyuridine-5′-triphosphate, O6-methyl-2′-deoxyguanosine-5′-triphosphate, N2-methyl-2′-deoxyguanosine-5′-triphosphate, 8-oxo-2′-deoxyguanosine-5′-triphosphate or thiothymidine-5′-triphosphate. In some embodiments, the mass-modified nucleobase comprises ¹⁵N or ¹³C or both ¹⁵N and ¹³C.

In some embodiments, bioagent identifying amplicons amenable to molecular mass determination which are produced by the primers described herein are either of a length, size or mass compatible with the particular mode of molecular mass determination or compatible with a means of providing a predictable fragmentation pattern in order to obtain predictable fragments of a length compatible with the particular mode of molecular mass determination. Such means of providing a predictable fragmentation pattern of an amplification product include, but are not limited to, cleavage with restriction enzymes or cleavage primers, for example. Methods of using restriction enzymes and cleavage primers are well known to those with ordinary skill in the art.

In some embodiments, amplification products corresponding to bacterial bioagent identifying amplicons are obtained using the polymerase chain reaction (PCR) which is a routine method to those with ordinary skill in the molecular biology arts. Other amplification methods may be used such as ligase chain reaction (LCR), low-stringency single primer PCR, and multiple strand displacement amplification (MDA) which are also well known to those with ordinary skill.

In the context of this invention, a “bioagent” is any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus. Examples of bioagents include, but are not limited to, cells (including but not limited to human clinical samples, bacterial cells and other pathogens), viruses, fungi, protists, parasites, and pathogenicity markers (including but not limited to pathogenicity islands, antibiotic resistance genes, virulence factors, toxin genes and other bioregulating compounds). Samples may be alive or dead or in a vegetative state (for example, vegetative bacteria or spores) and may be encapsulated or bioengineered. In the context of this invention, a “pathogen” is a bioagent which causes a disease or disorder.

In the context of this invention, the term “unknown bioagent” may mean either: (i) a bioagent whose existence is known (such as the well known bacterial species Staphylococcus aureus for example) but which is not known to be in a sample to be analyzed, or (ii) a bioagent whose existence is not known (for example, the SARS coronavirus was unknown prior to April 2003). For example, if the method for identification of coronaviruses disclosed in commonly owned U.S. Ser. No. 10/829,826 (incorporated herein by reference in entirety) was to be employed prior to April 2003 to identify the SARS coronavirus in a clinical sample, both meanings of “unknown” bioagent are applicable since the SARS coronavirus was unknown to science prior to April, 2003 and since it was not known what bioagent (in this case a coronavirus) was present in the sample. On the other hand, if the method of U.S. Ser. No. 10/829,826 was to be employed subsequent to April 2003 to identify the SARS coronavirus in a clinical sample, only the first meaning (i) of “unknown” bioagent would apply since the SARS coronavirus became known to science subsequent to April 2003 and since it was not known what bioagent was present in the sample.

In those embodiments wherein the bioagent is an RNA virus, the RNA of the virus is reverse transcribed to obtain corresponding DNA which can be subsequently amplified by procedures referred to above. In one embodiment, one means of reverse transcription is reverse transcriptase, an enzyme well known in the molecular biology arts.

The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as “triangulation identification.” Triangulation identification is pursued by analyzing a plurality of bioagent identifying amplicons selected within multiple core genes. This process can be used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J. Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.
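One simple way to picture triangulation identification is as a narrowing of candidate sets. The sketch below is illustrative only: it assumes that each bioagent identifying amplicon yields a set of candidate bioagents and that candidates are combined by set intersection. The text does not specify a combination rule, and the function name `triangulate` is hypothetical.

```python
def triangulate(candidates_per_amplicon):
    """Return the bioagents consistent with every amplicon measurement.

    candidates_per_amplicon: an iterable of collections, one per bioagent
    identifying amplicon, each holding the bioagents whose expected
    amplicon matches the measurement. Intersection is an assumed rule.
    """
    sets = [set(candidates) for candidates in candidates_per_amplicon]
    if not sets:
        return set()
    return set.intersection(*sets)


# Two amplicons that are individually ambiguous but jointly specific.
first = {"B. anthracis", "B. cereus"}
second = {"B. anthracis", "B. thuringiensis"}
consistent = triangulate([first, second])
```

An empty result under this rule would indicate that no single known bioagent explains all measurements, which is the kind of signal the passage above associates with hybrid or engineered bioagents.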

In some embodiments, the triangulation identification process can be pursued by characterization of bioagent identifying amplicons in a massively parallel fashion using the polymerase chain reaction (PCR), such as multiplex PCR where multiple primers are employed in the same amplification reaction mixture, or PCR in multi-well plate format wherein a different and unique pair of primers is used in multiple wells containing otherwise identical reaction mixtures. Such multiplex and multi-well PCR methods are well known to those with ordinary skill in the arts of rapid throughput amplification of nucleic acids.

In some embodiments, the molecular mass of a given bioagent identifying amplicon is determined by mass spectrometry. Mass spectrometry has several advantages, not the least of which is high bandwidth characterized by the ability to separate (and isolate) many molecular peaks across a broad range of mass to charge ratio (m/z). Thus mass spectrometry is intrinsically a parallel detection scheme without the need for radioactive or fluorescent labels, since every amplification product is identified by its molecular mass. The current state of the art in mass spectrometry is such that less than femtomole quantities of material can be readily analyzed to afford information about the molecular contents of the sample. An accurate assessment of the molecular mass of the material can be quickly obtained, irrespective of whether the molecular weight of the sample is several hundred, or in excess of one hundred thousand atomic mass units (amu) or Daltons.

In some embodiments, intact molecular ions are generated from amplification products using one of a variety of ionization techniques to convert the sample to gas phase. These ionization methods include, but are not limited to, electrospray ionization (ES), matrix-assisted laser desorption ionization (MALDI) and fast atom bombardment (FAB). Upon ionization, several peaks are observed from one sample due to the formation of ions with different charges. Averaging the multiple readings of molecular mass obtained from a single mass spectrum affords an estimate of molecular mass of the bioagent identifying amplicon. Electrospray ionization mass spectrometry (ESI-MS) is particularly useful for very high molecular weight polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since it yields a distribution of multiply-charged molecules of the sample without causing a significant amount of fragmentation.
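The averaging of multiple charge-state readings can be sketched as follows. This is a minimal illustration under stated assumptions: negative-ion electrospray in which each ion is the neutral molecule minus z protons (an assumption, since the ion polarity is not stated here), charge states that have already been assigned, and a proton mass constant supplied by the sketch. The function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da; assumed constant, not taken from the text


def neutral_mass(mz, z):
    """Neutral mass implied by one peak of an [M - zH]^z- ion."""
    return z * (mz + PROTON_MASS)


def average_mass(peaks):
    """Average the neutral masses implied by (m/z, charge) pairs."""
    masses = [neutral_mass(mz, z) for mz, z in peaks]
    return sum(masses) / len(masses)


# Synthetic peaks generated from a single neutral mass at three charge states.
true_mass = 28480.75
peaks = [(true_mass / z - PROTON_MASS, z) for z in (20, 25, 30)]
estimate = average_mass(peaks)
```

With real spectra each charge state carries its own measurement error, so the average is an estimate rather than an exact recovery as in this synthetic case.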

The mass detectors used in the methods of the present invention include, but are not limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), ion trap, quadrupole, magnetic sector, time of flight (TOF), Q-TOF, and triple quadrupole.

In some embodiments, conversion of molecular mass data to a base composition is useful for certain analyses. As used herein, a “base composition” is the exact number of each nucleobase (A, T, C and G). For example, amplification of nucleic acid of Neisseria meningitidis with a primer pair produces an amplification product from nucleic acid of 23S rRNA that has a molecular mass (sense strand) of 28480.75124, from which a base composition of A25 G27 C22 T18 is assigned from a list of possible base compositions calculated from the molecular mass using standard known molecular masses of each of the four nucleobases.
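The calculation of a list of possible base compositions from a molecular mass can be sketched as a bounded enumeration. The sketch below rests on assumptions that are not stated in the text: monoisotopic masses for the four deoxynucleotide residues, a linear single strand with hydroxyl groups at both ends, and a tolerance of 0.01 Da. The function names are hypothetical. Under these assumptions the calculated mass of A25 G27 C22 T18 agrees with the value quoted above to within the tolerance.

```python
# Assumed monoisotopic masses (Da) of deoxynucleotide residues in a strand.
RESIDUE = {"A": 313.05761, "G": 329.05252, "C": 289.04637, "T": 304.04604}
WATER = 18.01056  # added for the two strand termini
HPO3 = 79.96633   # removed because a 5'-hydroxyl (no phosphate) is assumed


def strand_mass(a, g, c, t):
    """Calculated mass of a single strand with the given base counts."""
    residues = (a * RESIDUE["A"] + g * RESIDUE["G"]
                + c * RESIDUE["C"] + t * RESIDUE["T"])
    return residues + WATER - HPO3


def candidate_compositions(mass, tol=0.01):
    """List every (A, G, C, T) whose calculated mass is within tol of mass."""
    core = mass - WATER + HPO3  # mass attributable to residues alone
    hits = []
    for a in range(int((core + tol) // RESIDUE["A"]) + 1):
        rem_a = core - a * RESIDUE["A"]
        for g in range(int((rem_a + tol) // RESIDUE["G"]) + 1):
            rem_g = rem_a - g * RESIDUE["G"]
            for c in range(int((rem_g + tol) // RESIDUE["C"]) + 1):
                rem_c = rem_g - c * RESIDUE["C"]
                # Only the nearest T count can fall within the tolerance.
                t = round(rem_c / RESIDUE["T"])
                if t >= 0 and abs(rem_c - t * RESIDUE["T"]) <= tol:
                    hits.append((a, g, c, t))
    return hits


candidates = candidate_compositions(28480.75124)
```

The returned list may hold more than one composition at a realistic tolerance, which is the ambiguity that mass-modifying tags are described as reducing.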

In some embodiments, assignment of base compositions to experimentally determined molecular masses is accomplished using “base composition probability clouds.” Base compositions, like sequences, vary slightly from isolate to isolate within species. It is possible to manage this diversity by building “base composition probability clouds” around the composition constraints for each species. This permits identification of organisms in a fashion similar to sequence analysis. Optimal primer design requires optimal choice of bioagent identifying amplicons and maximizes the separation between the base composition signatures of individual bioagents. Areas where clouds overlap indicate regions that may result in a misclassification, a problem which is overcome by a triangulation identification process using bioagent identifying amplicons not affected by overlap of base composition probability clouds.

In some embodiments, base composition probability clouds provide the means for screening potential primer pairs in order to avoid potential misclassifications of base compositions. In other embodiments, base composition probability clouds provide the means for predicting the identity of a bioagent whose assigned base composition was not previously observed and/or indexed in a bioagent identifying amplicon base composition database due to evolutionary transitions in its nucleic acid sequence. Thus, in contrast to probe-based techniques, mass spectrometry determination of base composition does not require prior knowledge of the composition or sequence in order to make the measurement.
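The idea of assigning an observed base composition to a nearby known composition can be sketched with a simple distance measure. This is only an illustration: the text does not define how probability clouds are computed, so the sketch assumes a cloud is the set of compositions within a fixed number of base-count changes of any indexed composition. The labels, compositions, radius and function names are all hypothetical.

```python
def distance(comp_a, comp_b):
    """Sum of absolute differences between two (A, G, C, T) count tuples."""
    return sum(abs(x - y) for x, y in zip(comp_a, comp_b))


def classify(observed, database, radius=2):
    """Return labels whose indexed compositions lie within radius of observed.

    database maps a label to a list of (A, G, C, T) tuples. More than one
    returned label corresponds to overlapping clouds in this simplified model.
    """
    hits = set()
    for label, compositions in database.items():
        if any(distance(observed, known) <= radius for known in compositions):
            hits.add(label)
    return sorted(hits)


# Hypothetical index with placeholder labels and compositions.
database = {
    "species_X": [(25, 27, 22, 18)],
    "species_Y": [(30, 20, 22, 20)],
}
near_x = classify((25, 26, 23, 18), database)
unmatched = classify((40, 40, 40, 40), database)
```

An observation that returns several labels would be a candidate for resolution with a second amplicon, as described for triangulation identification.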

The present invention provides bioagent classifying information similar to DNA sequencing and phylogenetic analysis at a level sufficient to detect and identify a given bioagent. Furthermore, the process of determination of a previously unknown base composition for a given bioagent (for example, in a case where sequence information is unavailable) has downstream utility by providing additional bioagent indexing information with which to populate base composition databases. The process of future bioagent identification is thus greatly improved as more base composition signature (BCS) indexes become available in base composition databases.

The present invention also provides kits for carrying out the methods described herein. In some embodiments, the kit may comprise a sufficient quantity of one or more primer pairs to perform an amplification reaction on a target polynucleotide from a bioagent to form a bioagent identifying amplicon. In some embodiments, the kit may comprise from one to fifty primer pairs, from one to twenty primer pairs, from one to ten primer pairs, or from two to five primer pairs. In some embodiments, the kit may comprise one or more primer pairs recited in Table 1.

In some embodiments, the kit may comprise broad range survey primers, division-wide primers, clade group primers or drill-down primers, or any combination thereof. A kit may be designed so as to comprise particular primer pairs for identification of a particular bioagent. For example, a broad range survey primer kit may be used initially to identify an unknown bioagent as a member of the Bacillus/Clostridia group. In another example, a division-wide kit may be used to distinguish Bacillus anthracis, Bacillus cereus and Bacillus thuringiensis from each other. A drill-down kit may be used, for example, to identify genetically engineered Bacillus anthracis. In some embodiments, any of these kits may be combined to comprise a combination of broad range survey primers and division-wide primers so as to be able to identify the species of an unknown bioagent.

In some embodiments, the kit may contain standardized nucleic acids for use as internal amplification calibrants.

In some embodiments, the kit may also comprise a sufficient quantity of reverse transcriptase (if an RNA virus is to be identified for example), a DNA polymerase, suitable nucleoside triphosphates (including any of those described above), a DNA ligase, and/or reaction buffer, or any combination thereof, for the amplification processes described above. A kit may further include instructions pertinent for the particular embodiment of the kit, such instructions describing the primer pairs and amplification conditions for operation of the method. A kit may also comprise amplification reaction containers such as microcentrifuge tubes and the like. A kit may also comprise reagents or other materials for isolating bioagent nucleic acid or bioagent identifying amplicons from amplification, including, for example, detergents, solvents, or ion exchange resins which may be linked to magnetic beads. A kit may also comprise a table of measured or calculated molecular masses and/or base compositions of bioagents using the primer pairs of the kit.

While the present invention has been described with specificity in accordance with certain of its embodiments, the following examples serve only to illustrate the invention and are not intended to limit the same. Throughout these examples, molecular cloning reactions, and other standard recombinant DNA techniques, may be carried out according to methods described in Maniatis et al., Molecular Cloning—A Laboratory Manual, 2nd ed., Cold Spring Harbor Press (1989), using commercially available reagents, except where otherwise noted.

EXAMPLES

Example 1 Design of Calibrant Polynucleotides Based on Viral Bioagent Identifying Amplicons from the SARS Coronavirus (Viral Bioagent Identifying Amplicons)

This example describes the design of two coronavirus calibrant polynucleotides based on viral bioagent identifying amplicons for identification of coronaviruses in the RNA-dependent RNA polymerase (RdRp) gene and in the nspl1 gene, which are described in a method for identification of coronaviruses disclosed in U.S. application Ser. No. 10/829,826. The primers used to define the viral bioagent identifying amplicons hybridize to regions of the RdRp gene (primer pair no. 453: forward primer TAAGU^(a)U^(a)TU^(a)ATGGCGGCU^(a)GG (SEQ ID NO: 1) and reverse primer TTTAGGATAGTC^(a)C^(a)C^(a)AACCCAT (SEQ ID NO: 2)) and the nspl1 gene (primer pair no. 455: forward primer TGTTTGU^(a)U^(a)U^(a)U^(a)GGAATTGTAATGTTGA (SEQ ID NO: 3) and reverse primer TGGAATGCATGCU^(a)U^(a)AU^(a)U^(a)AACATACA (SEQ ID NO: 4)), wherein U^(a) represents 5-propynyluracil and C^(a) represents 5-propynylcytosine. The calibration sequence chosen to simulate the RdRp calibration amplicon is SEQ ID NO: 5, which corresponds to positions 15146 to 15233 of NC_004718.3 (SARS coronavirus TOR2 genome) with deletion of positions 15179-15183 to yield a calibration amplicon length of 83 bp. The calibration sequence for the nspl1 calibration amplicon is SEQ ID NO: 6, which corresponds to positions 19113 to 19249 of NC_004718.3 (SARS coronavirus TOR2 genome) with deletion of positions 19172-19176 to yield a calibration amplicon length of 132 bp. Both calibrant standard sequences (SEQ ID NOs: 5 and 6) were included on a single polynucleotide (SEQ ID NO: 7, herein designated a “combination calibration polynucleotide”) which was cloned into a pCR®-Blunt vector (Invitrogen, Carlsbad, Calif.). Thus, when the combination calibration polynucleotide is added to an amplification reaction, an RdRp-based calibration amplicon will be produced with primer pair 453 (SEQ ID NOs: 1:2) and an nspl1-based calibration amplicon will be produced with primer pair 455 (SEQ ID NOs: 3:4).

The viral bioagent identifying amplicons are used as identifiers of coronaviruses due to the variable regions between the conserved priming regions which can be distinguished by mass spectrometry. The calibration polynucleotides are used to produce calibration amplicons from which the quantity of identified coronavirus is determined.
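The construction of a calibration sequence by an internal deletion can be sketched as a slicing operation on the reference genome. The sketch below is illustrative: the function name is hypothetical, coordinates are treated as 1-based and inclusive (an assumption consistent with the lengths stated in this example), and a placeholder string stands in for the genome, since the actual sequence is not reproduced here.

```python
def make_calibration_sequence(genome, start, end, del_start, del_end):
    """Extract genome[start..end] and remove the internal deleted region.

    All coordinates are 1-based and inclusive. The deleted region must lie
    inside the extracted region so that both primer binding sites survive.
    """
    if not (start <= del_start <= del_end <= end):
        raise ValueError("deletion must lie within the extracted region")
    left = genome[start - 1:del_start - 1]  # start .. del_start - 1
    right = genome[del_end:end]             # del_end + 1 .. end
    return left + right


# Placeholder genome; only the coordinate arithmetic is being illustrated.
placeholder_genome = "ACGT" * 5000
rdrp_calibrant = make_calibration_sequence(
    placeholder_genome, 15146, 15233, 15179, 15183)
nspl1_calibrant = make_calibration_sequence(
    placeholder_genome, 19113, 19249, 19172, 19176)
```

The five-residue deletion leaves the calibration amplicon amplifiable by the same primer pair while shifting its mass enough to be distinguished from the bioagent identifying amplicon.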

Example 2 Use of a Calibration Polynucleotide for Determining the Quantity of Coronavirus in a Clinical Sample

To determine the quantity of SARS coronavirus in a clinical sample, viable SARS coronavirus was added to human serum and analyzed. The TOR2 isolate of the SARS coronavirus from three passages in Vero cells was titered by plaque assay. Virus was handled in a P3 facility by investigators wearing forced air respirators. Equipment and supplies were decontaminated with 10% hypochlorite bleach solution for a minimum of 30 minutes or by immersion in 10% formalin for a minimum of 12 hours and virus was handled in strict accordance with specific Scripps Research Institute policy. SARS coronavirus was cultured in sub confluent Vero-E6 cells at 37° C., 5% CO₂ in complete DMEM with final concentrations of 10% fetal bovine serum (Hyclone, Salt Lake City, Utah), 292 μg/mL L-Glutamine, 100 U/mL penicillin G sodium, 100 μg/mL streptomycin sulfate (Invitrogen, Carlsbad Calif.), and 10 mM HEPES (Invitrogen, Carlsbad Calif.). Virus-containing medium was collected during the peak of viral cytopathic effects, 48 h after inoculation with approximately 10 PFU/cell of SARS coronavirus from the second passage of stock virus. Infectious virus was titered by plaque assay. Monolayers of Vero-E6 cells were prepared at 70-80% confluence in tissue culture plates. Serial tenfold dilutions of virus were prepared in complete DMEM. Medium was aspirated from cells, replaced by 200 μL of inoculum, and cells were incubated at 37° C., 5% CO₂ for 1 hour. Cells were overlaid with 2-3 mL/well of 0.7% agarose, 1×DMEM overlay containing 2% fetal bovine serum. Agarose was allowed to solidify at room temperature then cells were incubated at 37° C., 5% CO₂ for 72 h. Plates were decontaminated by overnight formalin immersion, agarose plugs were removed, and cells were stained with 0.1% crystal violet to highlight viral plaques.

RNA was isolated from serum containing two different concentrations of the virus (1.7×10⁵ and 170 PFU/mL) and reverse transcribed to cDNA using random primers and reverse transcriptase. A PFU (plaque forming unit) is a quantitative measure of the number of infectious virus particles in a given sample, since each infectious virus particle can give rise to a single clear plaque on infection of a continuous “lawn” of bacteria or a continuous sheet of cultured cells. PCR amplifications were performed using both the RdRp and the nspl1 primer sets on serial ten-fold dilutions of these cDNAs. Amplification products were purified and analyzed by methods commonly owned and disclosed in U.S. application Ser. Nos. 10/829,826 and 10/844,938, each of which is incorporated herein by reference in its entirety. The limit of SARS coronavirus detection was 10⁻² PFU per PCR reaction (~1.7 PFU/mL serum). Since PFU reflects the number of infectious viral particles and not the total number of RNA genomes, the number of reverse-transcribed SARS genomes was estimated by competitive, quantitative PCR using a calibration polynucleotide. Analysis of ratios of mass spectral peak heights of titrations of the calibration polynucleotide and the SARS cDNA showed that approximately 300 reverse-transcribed viral genomes were present per PFU, similar to the ratio of viral genome copies per PFU reported for RNA viruses (J. S. Towner et al., J Virol In Press (2004)). Using this estimate, the PCR primers were sensitive to three genomes per PCR reaction, consistent with previously reported detection limits for optimized SARS-specific primers (Drosten et al., New England Journal of Medicine, 2003, 348, 1967). When RT-PCR products were measured for varying dilutions of the SARS virus spiked directly into serum, 1 PFU (~300 genomes) per PCR reaction or 170 PFU (5.1×10⁴ genomes) per mL serum could be reliably detected. The discrepancy between the detection sensitivities in the two experimental protocols described above suggests that there were losses associated with RNA extraction and reverse transcription when very little virus was present (<300 copies) in the starting sample in serum.

To determine the relationship between PFU and copies of nucleic acid target, the virus stock was analyzed using the methods of the present invention. Synthetic DNA templates with nucleic acid sequence identical in all respects to RdRp-based (SEQ ID NO: 5) and nspl1-based (SEQ ID NO: 6) viral bioagent identifying amplicons for the SARS coronavirus, with the exception of 5 base deletions internal to each amplicon, were combined into a single combination calibration polynucleotide (SEQ ID NO: 7) and cloned into a pCR®-Blunt vector (Invitrogen, Carlsbad, Calif.) to produce a calibration polynucleotide. The calibrant plasmid stock solutions were quantified using OD₂₆₀ measurements, serially diluted (10-fold dilutions), mixed with a fixed amount of post-reverse transcriptase cDNA preparation of the virus stock, and analyzed by competitive PCR and electrospray mass spectrometry. Each PCR reaction produced two sets of amplicons, one corresponding to the calibrant amplicon and the other to the viral bioagent identifying amplicon. Since the primers hybridize to both the calibration polynucleotide and the coronavirus cDNA, it was reasonably assumed that the calibration polynucleotide and coronavirus cDNA would have similar PCR efficiencies for amplification of the two products. Analysis of the ratios of peak heights (abundance data) of the resultant mass spectra of the calibration amplicons and viral bioagent identifying amplicons was used to determine the number of nucleic acid copies (as measured by calibrant molecules) present per PFU. Since all of the extracted RNA was used in the reverse transcriptase step to produce the viral cDNA, the approximate amount of nucleic acids associated with infectious virus particles in the original viral preparation could be estimated. Mass spectrometry analysis showed an approximate 1:1 peak abundance between the calibrant peak at the 3×10⁴ copy number dilution and the viral bioagent identifying amplicon peak for the RdRp primer set (FIG. 2). Thus, the relationship between PFU and copies of nucleic acid was calculated to be 1 PFU=300 copies of nucleic acid.
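The arithmetic behind this estimate can be sketched as follows, assuming (as the example does) that the calibrant and the viral cDNA amplify with similar efficiency, so that the ratio of peak heights approximates the ratio of starting copies. The function names are hypothetical, peak heights are in arbitrary units, and the figure of 100 PFU per reaction is an illustrative input that is not stated in the text.

```python
def copies_from_peak_ratio(calibrant_copies, analyte_peak, calibrant_peak):
    """Estimate analyte template copies from an internal calibrant.

    Assumes equal amplification efficiency for both templates, so that
    analyte_copies / calibrant_copies equals the ratio of peak heights.
    """
    if calibrant_peak <= 0:
        raise ValueError("calibrant peak height must be positive")
    return calibrant_copies * analyte_peak / calibrant_peak


def copies_per_pfu(copies, pfu):
    """Nucleic acid copies per plaque forming unit."""
    return copies / pfu


# A 1:1 peak ratio at the 3e4-copy calibrant dilution, as reported above.
viral_copies = copies_from_peak_ratio(3e4, analyte_peak=1.0, calibrant_peak=1.0)

# Illustrative input only: PFU equivalents of cDNA in the reaction.
ratio = copies_per_pfu(viral_copies, pfu=100)

# Converting the reported limits with the 300 copies/PFU estimate.
genomes_at_detection_limit = 1e-2 * 300
genomes_per_ml_spiked_serum = 170 * 300
```

In practice the calibrant dilution whose peak ratio is closest to 1:1 is the most informative, since the equal-efficiency assumption is least strained when the two templates are present at similar abundance.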

The calibration sequences described in this example are appropriate for use in production of calibration amplicons which are in turn useful for determining the quantity of all known members of the coronavirus family. Further, it is reasonably expected that these calibration sequences will likewise be appropriate for quantification of any coronaviruses that are yet to be discovered.

Example 3 Design of Calibrant Polynucleotides based on Bioagent Identifying Amplicons for Identification of Species of Bacteria (Bacterial Bioagent Identifying Amplicons)

This example describes the design of 19 calibrant polynucleotides based on broad range bacterial bioagent identifying amplicons. The bacterial bioagent identifying amplicons are obtained upon amplification of bacterial nucleic acid with primers (Table 1) that have been disclosed in U.S. patent application Ser. Nos. 10/660,122, 10/728,486, and 60/559,754, each of which is commonly owned and incorporated herein by reference in its entirety.

Calibration sequences were designed to simulate bacterial bioagent identifying amplicons produced by the primer pairs shown in Table 1. The calibration sequences were chosen as a representative member of the section of bacterial genome from specific bacterial species which would be amplified by a given primer pair. The model bacterial species upon which the calibration sequences are based are also shown in Table 1. For example, the calibration sequence chosen to correspond to an amplicon produced by primer pair no. 346 is SEQ ID NO: 8. In Table 1, the forward (_F) or reverse (_R) primer name indicates the coordinates of an extraction representing a gene of a standard reference bacterial genome to which the primer hybridizes. For example, the forward primer name 16S_EC_713_732_TMOD_F indicates that the forward primer hybridizes to residues 713-732 of the gene encoding 16S ribosomal RNA in an E. coli reference sequence (in this case, the reference sequence (SEQ ID NO: 66 in Table 2) is an extraction consisting of residues 4033120-4034661 of the genomic sequence of E. coli K12 (GenBank Accession No. NC_000913)). Additional gene coordinate reference information is shown in Table 2. The designation “TMOD” in the primer names indicates that the 5′ end of the primer has been modified with a non-matched template T residue. This modification prevents the PCR polymerase from adding non-templated adenosine residues to the 5′ end of the amplification product, an occurrence which may result in miscalculation of base composition from molecular mass data.
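The primer naming convention described above lends itself to mechanical parsing. The sketch below is a hypothetical helper rather than part of the disclosed method: it assumes that names follow the pattern gene, reference code, start, end, an optional numeric suffix (as in 16S_EC_1090_1111_2_TMOD_F, whose meaning is not explained in the text), an optional TMOD marker, and a direction letter.

```python
import re

# Assumed pattern for the primer names listed in Table 1.
PRIMER_NAME = re.compile(
    r"^(?P<gene>[A-Za-z0-9]+)_(?P<ref>[A-Z]+)_(?P<start>\d+)_(?P<end>\d+)"
    r"(?:_(?P<suffix>\d+))?(?P<tmod>_TMOD)?_(?P<direction>[FR])$"
)


def parse_primer_name(name):
    """Split a primer name into its gene, coordinates and modifiers."""
    match = PRIMER_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return {
        "gene": match.group("gene"),
        "reference": match.group("ref"),      # e.g. EC for the E. coli reference
        "start": int(match.group("start")),
        "end": int(match.group("end")),
        "suffix": match.group("suffix"),      # None when absent
        "tmod": match.group("tmod") is not None,
        "direction": "forward" if match.group("direction") == "F" else "reverse",
    }


parsed = parse_primer_name("16S_EC_713_732_TMOD_F")
```

Here `parsed["start"]` and `parsed["end"]` give the hybridization coordinates within the gene extraction, which can then be looked up against the reference coordinates in Table 2.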

The 19 calibration sequences shown in Table 1 were combined into a single calibration polynucleotide sequence (SEQ ID NO: 9—which is herein designated a “combination calibration polynucleotide”) which was then cloned into a pCR®-Blunt vector (Invitrogen, Carlsbad, Calif.). This combination calibration polynucleotide can be used in conjunction with the primers of Table 1 as an internal standard to produce calibration amplicons for use in determination of the quantity of any bacterial bioagent. Thus, for example, when the combination calibration polynucleotide vector is present in an amplification reaction mixture, a calibration amplicon based on primer pair 346 (16S rRNA) will be produced in an amplification reaction with primer pair 346 and a calibration amplicon based on primer pair 363 (rpoC) will be produced with primer pair 363.

TABLE 1 Bacterial Primer Pairs for Production of Bacterial Bioagent Identifying Amplicons and Corresponding Representative Calibration Sequences
Primer Pair No. | Forward Primer Name | Forward Primer (SEQ ID NO:) | Reverse Primer Name | Reverse Primer (SEQ ID NO:) | Calibration Sequence Model Species | Calibration Sequence (SEQ ID NO:)
346 | 16S_EC_713_732_TMOD_F | 10 | 16S_EC_789_809_TMOD_R | 11 | Bacillus anthracis | 8
347 | 16S_EC_785_806_TMOD_F | 12 | 16S_EC_880_897_TMOD_R | 13 | Bacillus anthracis | 14
348 | 16S_EC_960_981_TMOD_F | 15 | 16S_EC_1054_1073_TMOD_R | 16 | Bacillus anthracis | 17
349 | 23S_EC_1826_1843_TMOD_F | 18 | 23S_EC_1906_1924_TMOD_R | 19 | Bacillus anthracis | 20
350 | CAPC_BA_274_303_TMOD_F | 21 | CAPC_BA_349_376_TMOD_R | 22 | Bacillus anthracis | 23
351 | CYA_BA_1353_1379_TMOD_F | 24 | CYA_BA_1448_1467_TMOD_R | 25 | Bacillus anthracis | 26
352 | INFB_EC_1365_1393_TMOD_F | 27 | INFB_EC_1439_1467_TMOD_R | 28 | Bacillus anthracis | 29
353 | LEF_BA_756_781_TMOD_F | 30 | LEF_BA_843_872_TMOD_R | 31 | Bacillus anthracis | 32
354 | RPOC_EC_2218_2241_TMOD_F | 33 | RPOC_EC_2313_2337_TMOD_R | 34 | Bacillus anthracis | 35
355 | SSPE_BA_115_137_TMOD_F | 36 | SSPE_BA_197_222_TMOD_R | 37 | Bacillus anthracis | 38
356 | RPLB_EC_650_679_TMOD_F | 39 | RPLB_EC_739_762_TMOD_R | 40 | Clostridium botulinum | 41
358 | VALS_EC_1105_1124_TMOD_F | 42 | VALS_EC_1195_1218_TMOD_R | 43 | Yersinia pestis | 44
359 | RPOB_EC_1845_1866_TMOD_F | 45 | RPOB_EC_1909_1929_TMOD_R | 46 | Yersinia pestis | 47
360 | 23S_EC_2646_2667_TMOD_F | 48 | 23S_EC_2745_2765_TMOD_R | 49 | Bacillus anthracis | 50
361 | 16S_EC_1090_1111_2_TMOD_F | 51 | 16S_EC_1175_1196_TMOD_R | 52 | Bacillus anthracis | 53
362 | RPOB_EC_3799_3821_TMOD_F | 54 | RPOB_EC_3862_3888_TMOD_R | 55 | Burkholderia mallei | 56
363 | RPOC_EC_2146_2174_TMOD_F | 57 | RPOC_EC_2227_2245_TMOD_R | 58 | Burkholderia mallei | 59
367 | TUFB_EC_957_979_TMOD_F | 60 | TUFB_EC_1034_1058_TMOD_R | 61 | Burkholderia mallei | 62
449 | RPLB_EC_690_710_F | 63 | RPLB_EC_737_758_R | 64 | Clostridium botulinum | 65

TABLE 2 Primer Pair Gene Coordinate References and Calibration Polynucleotide Sequence Coordinates within the Combination Calibration Polynucleotide
Bacterial Gene | Primer Pair No. | Gene Extraction SEQ ID NO: | Gene Extraction Coordinates of Genomic or Plasmid Sequence | GenBank Accession No. of Genomic (G) or Plasmid (P) Sequence | Coordinates of Calibration Sequence in Combination Calibration Polynucleotide (SEQ ID NO: 9)
16S E. coli | 346 | 66 | 4033120..4034661 | NC_000913 (G) | 16..109
16S E. coli | 347 | 66 | 4033120..4034661 | NC_000913 (G) | 83..190
16S E. coli | 348 | 66 | 4033120..4034661 | NC_000913 (G) | 246..353
16S E. coli | 361 | 66 | 4033120..4034661 | NC_000913 (G) | 368..469
23S E. coli | 349 | 67 | 4166220..4169123 | NC_000913 (G) | 743..837
23S E. coli | 360 | 67 | 4166220..4169123 | NC_000913 (G) | 865..981
rpoB E. coli | 359 | 68 | 4178823..4182851 (complement strand) | NC_000913 (G) | 1591..1672
rpoB E. coli | 362 | 68 | 4178823..4182651 (complement strand) | NC_000913 (G) | 2081..2167
rpoC E. coli | 354 | 69 | 4182928..4187151 | NC_000913 (G) | 1810..1926
rpoC E. coli | 363 | 69 | 4182928..4187151 | NC_000913 (G) | 2183..2279
infB E. coli | 352 | 70 | 3313655..3310983 (complement strand) | NC_000913 (G) | 1692..1791
tufB E. coli | 367 | 71 | 4173523..4174707 | NC_000913 (G) | 2400..2498
rplB E. coli | 356 | 72 | 3449001..3448180 | NC_000913 (G) | 1945..2060
rplB E. coli | 449 | 72 | 3449001..3448180 | NC_000913 (G) | 1986..2055
valS E. coli | 358 | 73 | 4481405..4478550 (complement strand) | NC_000913 (G) | 1462..1572
capC B. anthracis | 350 | 74 | 56074..55628 (complement strand) | AF188935 (P) | 2517..2616
Cya B. anthracis | 351 | 75 | 156626..154288 (complement strand) | AF065404 (P) | 1338..1449
Lef B. anthracis | 353 | 76 | 127442..129921 | AF065404 (P) | 1121..1234
sspE B. anthracis | 355 | 77 | 226496..226783 | AE017025 (G) | 1007..1104

Example 4 Use of a Calibration Polynucleotide for Determining the Quantity of Bacillus anthracis in a Sample Containing a Mixture of Microbes

The capC gene is a gene involved in capsule synthesis which resides on the pXO2 plasmid of Bacillus anthracis. Primer pair no. 350 (see Tables 1 and 2) was designed to identify Bacillus anthracis via production of a bacterial bioagent identifying amplicon. Known quantities of the combination calibration polynucleotide vector described in Example 3 were added to amplification mixtures containing bacterial bioagent nucleic acid from a mixture of microbes which included the Ames strain of Bacillus anthracis. Upon amplification of the bacterial bioagent nucleic acid and the combination calibration polynucleotide vector with primer pair no. 350, bacterial bioagent identifying amplicons and calibration amplicons were obtained and characterized by mass spectrometry. A spectrum of an amplified nucleic acid mixture containing the Ames strain of Bacillus anthracis, a known quantity of combination calibration polynucleotide vector which includes the capC calibration sequence for Bacillus anthracis, and primer pair 350 is shown in FIG. 3. The molecular masses of the bioagent identifying amplicons provided the means for identification of the bioagent from which they were obtained (Ames strain of Bacillus anthracis) and the molecular masses of the calibration amplicons provided the means for their identification as well. The relationship between the abundance (peak height) of the calibration amplicon signals and the bacterial bioagent identifying amplicon signals provides the means of calculation of the copies of the pXO2 plasmid of the Ames strain of Bacillus anthracis. Methods of calculating quantities of molecules based on internal calibration procedures are well known to those of ordinary skill in the art.

Calibration amplicons and bacterial bioagent identifying amplicons produced in the reaction are visible in the mass spectrum as indicated, and abundance (peak height) data are used to calculate the quantity of the pXO2 plasmid of the Ames strain of Bacillus anthracis in the sample. Averaging the results of 10 repetitions of the experiment described above enabled a calculation indicating that the quantity of Ames strain of Bacillus anthracis present in the sample corresponds to approximately 10 copies of pXO2 plasmid.

Various modifications of the invention, in addition to those described herein, will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims. Each reference (including, but not limited to, journal articles, U.S. and non-U.S. patents, patent application publications, international patent application publications, GenBank accession numbers, and the like) cited in the present application is incorporated herein by reference in its entirety.

1. A method for simultaneous analysis of the identity and quantity of a bioagent in a sample comprising: contacting nucleic acid of said sample with: a pair of primers designed to produce a bioagent identifying amplicon from said nucleic acid under amplification conditions; and a known quantity of a calibration polynucleotide comprising a calibration sequence designed to produce a calibration amplicon as a result of amplification with said primers under said amplification conditions; concurrently amplifying said nucleic acid and said calibration sequence with said pair of primers in the same amplification mixture to obtain a first amplification product comprising a bioagent identifying amplicon and a second amplification product comprising a calibration amplicon; obtaining molecular mass data and abundance data for said first and second amplification products in said amplification mixture wherein the 5′ and 3′ ends of said amplification products are the sequences of said pair of primers or complements thereof; distinguishing said first and second amplification products based on their respective molecular masses; comparing the molecular mass of said first amplification product with a database of molecular masses of bioagent identifying amplicons for known bioagents wherein a match between the molecular mass of said first amplification product and a molecular mass of a bioagent identifying amplicon for a known bioagent in a database indicates the identity of said bioagent, and a molecular mass of said second amplification product identifies said calibration amplicon; and calculating the quantity of said bioagent from said abundance data of said first and second amplification products.
 2. A method for determining the identity and quantity of a bioagent in a sample comprising: contacting said sample with a pair of primers and a known quantity of a calibration polynucleotide comprising a calibration sequence; concurrently amplifying nucleic acid from said bioagent in said sample with said pair of primers and amplifying nucleic acid from said calibration polynucleotide in said sample with said pair of primers to obtain a first amplification product comprising a bioagent identifying amplicon and a second amplification product comprising a calibration amplicon; obtaining molecular mass and abundance data for said bioagent identifying amplicon and for said calibration amplicon wherein the 5′ and 3′ ends of said bioagent identifying amplicon and said calibration amplicon are the sequences of said pair of primers or complements thereof; and distinguishing said bioagent identifying amplicon from said calibration amplicon based on their respective molecular masses; and comparing a molecular mass of said bioagent identifying amplicon with a database of molecular masses of bioagent identifying amplicons for known bioagents wherein a match between a molecular mass of said first amplification product and a molecular mass of a bioagent identifying amplicon for a known bioagent in a database indicates the identity of said bioagent, and a molecular mass of said second amplification product identifies said calibration amplicon, and wherein comparison of bioagent identifying amplicon abundance data and calibration amplicon abundance data indicates the quantity of bioagent in said sample.
 3. The method of claim 2 wherein said bioagent is a bacterium or a virus.
 4. The method of claim 2 wherein said calibration sequence comprises a chosen standard sequence of a bioagent identifying amplicon with the exception of a deletion of about 2 to about 8 consecutive nucleotide residues of said standard sequence.
 5. The method of claim 2 wherein said calibration sequence comprises a chosen standard sequence of a bioagent identifying amplicon with the exception of an insertion of about 2 to about 8 consecutive nucleotide residues of said standard sequence.
 6. The method of claim 2 wherein said calibration sequence has at least 80% sequence identity with a chosen standard sequence of a bioagent identifying amplicon.
 7. The method of claim 2 wherein said calibration polynucleotide is present within a vector.
 8. The method of claim 2 wherein said molecular mass is obtained by mass spectrometry.
 9. The method of claim 2 wherein said molecular mass is obtained by ESI-FTICR or ESI-TOF mass spectrometry.
 10. The method of claim 2 comprising construction of a standard curve wherein the amount of said calibration polynucleotide in said sample is varied.
 11. The method of claim 2 comprising multiplex amplification wherein a plurality of bioagent identifying amplicons are amplified with a plurality of primer pairs which amplify corresponding calibration sequences.
 12. The method of claim 11 wherein said plurality of primer pairs comprise survey primers, division-wide primers, clade group primers and sub-species characteristic primers.
 13. The method of claim 2 comprising amplification of a plurality of bioagent identifying amplicons within a plurality of core genes with a plurality of primer pairs.
 14. The method of claim 13 wherein said primer pairs hybridize to conserved regions of nucleic acid of genes encoding 16S and 23S rRNAs, RNA polymerase subunits, t-RNA synthetases, elongation factors, ribosomal proteins, protein chain initiation factors, cell division proteins, chaperonin groEL, chaperonin dnaK, phosphoglycerate kinase, NADH dehydrogenase, DNA ligases, and DNA topoisomerases.
 15. A method for simultaneous analysis of the identity and quantity of a bioagent in a sample comprising: contacting nucleic acid of said sample with: a pair of primers designed to produce a bioagent identifying amplicon from said nucleic acid under amplification conditions; and a known quantity of a calibration polynucleotide comprising a calibration sequence designed to produce a calibration amplicon as a result of amplification with said primers under said amplification conditions; concurrently amplifying said nucleic acid and said calibration sequence with said pair of primers in the same amplification mixture to obtain a first amplification product comprising a bioagent identifying amplicon and a second amplification product comprising a calibration amplicon; obtaining molecular mass data and abundance data for said first and second amplification products in said amplification mixture wherein the 5′ and 3′ ends of said amplification products are the sequences of said pair of primers or complements thereof; determining the base compositions of said first and second amplification products based on their respective molecular masses; comparing a base composition of said first amplification product with a database of base compositions of bioagent identifying amplicons for known bioagents wherein a match between a base composition of an amplification product and a base composition of a bioagent identifying amplicon for a known bioagent in a database of base compositions indicates the identity of said bioagent, and a base composition of said second amplification product identifies said calibration amplicon; and calculating the quantity of said bioagent from said abundance data of said first and second amplification products.
 16. A method for determining the identity and quantity of a bioagent in a sample comprising: contacting said sample with a pair of primers and a known quantity of a calibration polynucleotide comprising a calibration sequence; concurrently amplifying nucleic acid from said bioagent in said sample with said pair of primers and amplifying nucleic acid from said calibration polynucleotide in said sample with said pair of primers to obtain a first amplification product comprising a bioagent identifying amplicon and a second amplification product comprising a calibration amplicon; obtaining molecular mass and abundance data for said bioagent identifying amplicon and for said calibration amplicon wherein the 5′ and 3′ ends of said bioagent identifying amplicon and said calibration amplicon are the sequences of said pair of primers or complements thereof; determining the base compositions of said bioagent identifying amplicon and said calibration amplicon based on their respective molecular masses; comparing a base composition of said bioagent identifying amplicon with a database of base compositions of bioagent identifying amplicons for known bioagents wherein a match between a base composition of said bioagent identifying amplicon and a base composition of a bioagent identifying amplicon for a known bioagent in a database of base compositions indicates the identity of said bioagent, and a base composition of said second amplification product identifies said calibration amplicon, and wherein comparison of bioagent identifying amplicon abundance data and calibration amplicon abundance data indicates the quantity of bioagent in said sample.
 17. The method of claim 16 wherein said bioagent is a bacterium or a virus.
 18. The method of claim 16 wherein said calibration sequence comprises a chosen standard sequence of a bioagent identifying amplicon with the exception of a deletion of about 2 to about 8 consecutive nucleotide residues of said standard sequence.
 19. The method of claim 16 wherein said calibration sequence comprises a chosen standard sequence of a bioagent identifying amplicon with the exception of an insertion of about 2 to about 8 consecutive nucleotide residues of said standard sequence.
 20. The method of claim 16 wherein said calibration sequence has at least 80% sequence identity with a chosen standard sequence of a bioagent identifying amplicon.
 21. The method of claim 16 wherein said calibration polynucleotide is present within a vector.
 22. The method of claim 16 wherein said molecular mass is obtained by mass spectrometry.
 23. The method of claim 16 wherein said molecular mass is obtained by ESI-FTICR or ESI-TOF mass spectrometry.
 24. The method of claim 16 comprising construction of a standard curve wherein the amount of said calibration polynucleotide in said sample is varied.
 25. The method of claim 16 comprising multiplex amplification wherein a plurality of bioagent identifying amplicons are amplified with a plurality of primer pairs which amplify corresponding calibration sequences.
 26. The method of claim 25 wherein said plurality of primer pairs comprise survey primers, division-wide primers, clade group primers and sub-species characteristic primers.
 27. The method of claim 16 comprising amplification of a plurality of bioagent identifying amplicons within a plurality of core genes with a plurality of primer pairs.
 28. The method of claim 27 wherein said primer pairs hybridize to conserved regions of nucleic acid of genes encoding 16S and 23S rRNAs, RNA polymerase subunits, t-RNA synthetases, elongation factors, ribosomal proteins, protein chain initiation factors, cell division proteins, chaperonin groEL, chaperonin dnaK, phosphoglycerate kinase, NADH dehydrogenase, DNA ligases, and DNA topoisomerases.
 29. The method of claim 16 wherein said base compositions are determined from said molecular masses by base composition probability cloud analysis.
 30. The method of claim 25 wherein said plurality of primer pairs are selected by base composition cloud analysis. 
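For illustration only, the data-analysis steps recited in claims 1 and 2 (distinguishing the two amplification products by molecular mass, matching the bioagent identifying amplicon against a database of molecular masses, and calculating quantity from the two abundances) might be sketched as follows. This is a minimal sketch, not the claimed method itself. The names `Peak`, `MASS_DATABASE`, `match_mass` and `analyze`, the mass values, and the parts-per-million tolerance are all hypothetical. The simple abundance ratio used for quantification assumes that the bioagent nucleic acid and the calibration sequence amplify with equal efficiency, which is an assumption of this sketch rather than a statement in the claims.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical database of bioagent identifying amplicon masses (Da).
# The entries are invented for illustration and are not real data.
MASS_DATABASE: Dict[str, float] = {
    "bioagent A": 35000.0,
    "bioagent B": 35620.5,
}


@dataclass
class Peak:
    mass: float       # measured molecular mass of an amplification product, Da
    abundance: float  # measured abundance of that product, arbitrary units


def ppm_error(measured: float, reference: float) -> float:
    """Absolute mass difference expressed in parts per million."""
    return abs(measured - reference) / reference * 1e6


def match_mass(mass: float, database: Dict[str, float],
               tol_ppm: float = 50.0) -> Optional[str]:
    """Return the closest database entry within tolerance, or None."""
    best_name, best_err = None, None
    for name, reference in database.items():
        err = ppm_error(mass, reference)
        if err <= tol_ppm and (best_err is None or err < best_err):
            best_name, best_err = name, err
    return best_name


def analyze(peaks: List[Peak], calibrant_mass: float, calibrant_quantity: float,
            database: Dict[str, float],
            tol_ppm: float = 50.0) -> Optional[Tuple[str, float]]:
    """Identify the bioagent and estimate its quantity from one mixture.

    calibrant_quantity is the known amount of calibration polynucleotide
    added to the sample (for example, copies per reaction).
    """
    calibrant_peak = None
    bioagent_hit = None
    for peak in peaks:
        if ppm_error(peak.mass, calibrant_mass) <= tol_ppm:
            calibrant_peak = peak              # second amplification product
        else:
            name = match_mass(peak.mass, database, tol_ppm)
            if name is not None:
                bioagent_hit = (name, peak)    # first amplification product
    if calibrant_peak is None or bioagent_hit is None:
        return None
    if calibrant_peak.abundance <= 0:
        return None
    name, peak = bioagent_hit
    # Assumption: equal amplification efficiency, so abundances scale linearly.
    quantity = calibrant_quantity * peak.abundance / calibrant_peak.abundance
    return name, quantity


if __name__ == "__main__":
    measured = [Peak(mass=35000.7, abundance=200.0),
                Peak(mass=33766.0, abundance=100.0)]
    print(analyze(measured, calibrant_mass=33766.0,
                  calibrant_quantity=100.0, database=MASS_DATABASE))
```

In this sketch the calibration amplicon is recognized by its expected mass, which differs from every database entry by far more than the tolerance, in the same spirit as the deletion or insertion of about 2 to about 8 nucleotide residues recited in claims 4 and 5. A standard curve as in claim 10 could be built by repeating the analysis while varying `calibrant_quantity`.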